Peptide sequence tags and method of using same

ABSTRACT

Peptide sequence tags are identified and used to produce a class of global antibodies, which recognize all members of the GlnA protein sub-family with uniform specificity, regardless of the species of origin. The tags are used to create antibodies to the major GlnA protein of the glutamine synthetase enzyme, which catalyzes the assimilation of ammonia. The antibodies have a range of applications as diagnostic detection reagents for this major environmental process.

This invention relates to peptide sequence tags used to elicit antibodies, which can be used to detect defined families of proteins.

Natural populations of phytoplankton include representatives of numerous species of cyanobacteria, diatoms, green algae and other groups. Nevertheless, many of these species share core biochemical pathways supporting primary productivity and elemental cycling (Bryant, 1994; Falkowski & Raven, 1997). To assess the gross capacity for key metabolic transformations in aquatic habitats, and to track acclimatory changes in these capacities, researchers require reagents to quantitatively detect all members of a functional class of enzymes (Bouchard et al., 2002; Schofield et al., 2002), for example the RbcL (RUBISCO) enzyme responsible for carbon fixation in all photosynthetic organisms. Different members of the organism population will contain somewhat different versions of the RbcL enzyme, which nevertheless share similar core properties and conserved regions (ncbi.nlm.nih.gov). Conventional immunological detection uses an antibody raised against one particular protein from one species, which will then bind with variable affinity to other related proteins depending on their antigenic similarity to the initial target molecule (Orellana & Perry, 1992). This is problematic because a change in immunological signal could result from: (a) changes in the level of the targets, (b) changes in the population composition resulting in shifts in the specific mix of target molecules present, or (c) a combination of (a) and (b).
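The ambiguity described above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the immunological signal is the affinity-weighted sum of the target concentrations; the species names, affinities and concentrations are invented and are not measurements.

```python
# Illustrative only: all affinities and concentrations are invented numbers.

def immunosignal(concentrations, affinities):
    """Signal modelled as the affinity-weighted sum of target concentrations."""
    return sum(affinities[name] * amount for name, amount in concentrations.items())

# Hypothetical relative affinities of an antibody for two RbcL variants.
conventional = {"species_A": 1.0, "species_B": 0.5}  # raised against species A
uniform = {"species_A": 1.0, "species_B": 1.0}       # even affinity across the family

baseline = {"species_A": 10.0, "species_B": 10.0}    # total target = 20
level_drop = {"species_A": 7.5, "species_B": 7.5}    # total target = 15, same mix
mix_shift = {"species_A": 2.5, "species_B": 17.5}    # total target = 20, new mix

for label, sample in [("baseline", baseline),
                      ("level drop", level_drop),
                      ("mix shift", mix_shift)]:
    print(label,
          immunosignal(sample, conventional),
          immunosignal(sample, uniform))
```

Under these assumed numbers the conventional antibody gives the same reading for the level drop and for the mix shift, whereas an antibody with uniform affinity reports the total target quantity in each case.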

Conventional antibodies are raised against two classes of protein targets, namely (a) purified or over-expressed protein from a particular species, and (b) a peptide selected to match the sequence of a region of a particular protein. Such antibodies are generally raised against proteins from a model species, and show variable cross-reactivity to related proteins from other species. It is not practical to develop individual antibodies to detect each protein of interest from each strain in a population, many of which are poorly characterized or unknown, and even unculturable (Staley & Reysenbach, 2002). Most of the protein families of interest in cyanobacteria and phytoplankton are also highly conserved in plants, so that the same detection system can meet needs for standard antibodies in plant sciences.

Thus, a need exists for a set of peptide targets to elicit production of a set of antibodies to detect key proteins involved, inter alia, in photosynthetic activity. Such a system should be able to evenly and universally recognize all members of a defined enzyme family or subfamily based on shared characteristics.

The above-defined need is met by the present invention, which provides a method for detecting the presence of members of the target protein family in a sample comprising the steps of:

(a) identifying and obtaining a peptide sequence tag conserved for all sequences of members of the target protein family, and exclusive to the target protein family;

(b) assessing the tag for immunogenicity potential;

(c) utilizing the tag to elicit the production of antibodies; and

(d) using the antibodies to measure the concentration of members of the target protein family in a sample.

The invention also provides a peptide sequence tag selected from the group consisting of

SEQ ID NO: 1

SEQ ID NO: 2

SEQ ID NO: 3

SEQ ID NO: 4

SEQ ID NO: 5

SEQ ID NO: 7

SEQ ID NO: 8

SEQ ID NO: 9

SEQ ID NO: 10

SEQ ID NO: 11

SEQ ID NO: 13

SEQ ID NO: 14

SEQ ID NO: 15

In accordance with another aspect of the invention, the invention provides a method for detecting the presence of a target protein in a sample comprising the steps of:

(a) identifying and obtaining a peptide sequence tag conserved for all members of and exclusive to a protein family;

(b) assessing the tag for immunogenicity;

(c) synthesizing the tag provided it possesses a predetermined level of immunogenicity;

(d) utilizing the tag to elicit the production of antibodies; and

(e) using the antibodies to provide an indication of protein concentration in a sample.

According to yet another aspect, the invention provides a method of using a peptide sequence tag for coupling to column matrix materials for affinity purification of the global antibodies produced according to the invention.

According to yet another aspect, there is provided a method of developing characterized concentration standards for quantitation of the concentration of target proteins in samples by comparison to the concentration standards, comprising the steps of:

(a) coupling a defined molar quantity of protein carrier molecule to a defined molar quantity of peptide sequence tag selected from the group consisting of

SEQ ID NO: 1

SEQ ID NO: 2

SEQ ID NO: 3

SEQ ID NO: 4

SEQ ID NO: 5

SEQ ID NO: 7

SEQ ID NO: 8

SEQ ID NO: 9

SEQ ID NO: 10

SEQ ID NO: 11

SEQ ID NO: 13

SEQ ID NO: 14

SEQ ID NO: 15

(b) subjecting a known molar quantity of the coupled complex from (a) to electrophoretic separation in parallel SDS-PAGE gel lanes with protein extracts containing members of the target protein family, followed by electrophoretic transfer to a membrane;

(c) immunodetection, using a global antibody produced according to the above-defined method, of the coupled standard and any members of the target protein family identified by the peptide sequence tag; and

(d) using an immunological signal from the known molar quantity of coupled complex as a standard for measuring the molar quantity of members of the target protein family present in the protein extracts.
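Step (d) can be sketched as a standard-curve calculation. The sketch below assumes that several known molar quantities of the coupled standard are loaded in parallel lanes and that the immunological signal responds linearly over that range; neither assumption is stated in the method itself, and the function names and all numbers are hypothetical.

```python
# Minimal sketch of quantitation against a coupled standard.
# Assumption: signal is linear in molar quantity over the loaded range.

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantify(signal, slope, intercept):
    """Convert a band signal into a molar quantity using the fitted line."""
    return (signal - intercept) / slope

# Hypothetical standard lanes: pmol of coupled complex and band signal.
standard_pmol = [0.5, 1.0, 2.0, 4.0]
standard_signal = [1100.0, 2100.0, 4100.0, 8100.0]

slope, intercept = fit_line(standard_pmol, standard_signal)

# Hypothetical band signal from a protein extract run on the same membrane.
sample_pmol = quantify(5100.0, slope, intercept)
print(f"estimated target protein: {sample_pmol:.2f} pmol")
```

A sample signal falling outside the range spanned by the standards would call for reloading at a different dilution rather than extrapolation.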

In a still further aspect, there is provided a method for quantitation of members of the target protein families in multiple samples using Enzyme-Linked ImmunoSorbent Assay kits based on characterized global antibodies produced according to the above-defined method and quantitation standards produced according to the method for developing concentration standards defined above.

In another aspect, there is provided a method for eliciting production of monoclonal, transgenic or synthetic antibodies using peptide sequence tags produced according to the above-defined method and standard immunological protocols.

In a further aspect, there is provided a method for affinity screening of libraries of reagents to detect specific reagent binding to a peptide sequence tag produced according to the above-defined method and immobilized on a support matrix; and

testing reagents binding to the immobilized peptide sequence tag for affinity binding to members of the target protein families.

In a still further embodiment, there is provided a method to use affinity binding using global antibodies produced according to the above-defined method to capture target proteins from complex mixtures for subsequent analyses of the specific sequences of target proteins present in the mixture using mass spectroscopy.

In yet another embodiment, there is provided a method to use affinity binding using global antibodies produced according to the above-defined method to capture and remove target proteins from complex mixtures to lower interference with detection and analyses of other less abundant protein classes in proteomics applications such as two-dimensional isoelectric focusing/SDS-PAGE and subsequent mass spectroscopic analyses of protein sequences.

In general terms, the inventors have designed a set of peptide targets or peptide sequence tags which elicit production of a set of antibodies for detecting key proteins involved in photosynthetic productivity. The inventors chose target protein families based on scientific interest and wide applicability. They found and aligned sequences from public databases to detect peptide sequence tags of 6-25 amino acids which are conserved in all known members of the target protein family, and used bioinformatic analyses to determine whether the conserved peptides are unique to the target protein family. The peptide sequence tags are assessed for potential immunogenicity, feasibility of synthesis, solubility and stability, avoiding sequences that are targets for known or putative post-translational modification in proteins. Selected peptide sequence tags are synthesized, coupled to a carrier and used to elicit antibody production. The specificity and titre of the antibodies are then tested. A set of antibodies increases the utility of the system by allowing comprehensive detection of key molecules in a sample, population or community. The target protein families were selected based on their core roles in the biosphere and their interest and importance for environmental research, modelling, and monitoring.

Public sequence databases were scanned for all published sequences of proteins in a given family (http://www.expasy.ch; http://www.ncbi.nlm.nih.gov). The sequences of all published members of each target protein family were aligned (Corpet, 1988). Peptide sequence tags of 6-25 amino acids were selected whose sequences are conserved in all known members of the target protein family or sub-family. The peptide sequence tags were tested to determine exclusivity to the target protein family using short-peptide BLAST searches of sequence databases (Altschul et al., 2001). The position of each potential peptide sequence tag in a given protein family was analyzed to ensure it is maintained in the mature proteins, and to avoid regions of the proteins known or suspected to undergo post-translational modifications such as glycosylation that could interfere with later antibody recognition of the mature proteins. The peptide sequence tags are screened for antigenic potential using peptide property prediction algorithms, and to assess their feasibility for synthesis, solubility and stability based on amino acid composition. In summary, the chosen peptide sequence tags are conserved in all published members of the defined target protein family or subfamily, do not align significantly with members of other known protein families, and have acceptable predicted antigenic and synthesis properties.
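The conservation and exclusivity screens described above can be sketched as follows. This is a simplified illustration and not the inventors' procedure: it assumes the alignment is already available as equal-length strings with "-" for gaps, it reports only runs of perfectly identical columns, and it uses an exact substring test as a crude stand-in for a short-peptide BLAST search. The function names and the toy sequences (the GlnA tag embedded in invented flanking residues) are hypothetical.

```python
def conserved_windows(aligned, min_len=6, max_len=25):
    """Find gap-free runs of columns that are identical in every sequence.

    Returns (start_column, peptide) pairs for runs of at least min_len
    residues. Runs longer than max_len are crudely truncated to max_len.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")

    conserved = [
        len({seq[i] for seq in aligned}) == 1 and aligned[0][i] != "-"
        for i in range(length)
    ]

    tags = []
    start = None
    # The trailing False closes a run that reaches the end of the alignment.
    for i, ok in enumerate(conserved + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            run = aligned[0][start:i]
            if len(run) >= min_len:
                tags.append((start, run[:max_len]))
            start = None
    return tags


def is_exclusive(tag, off_target_sequences):
    """Crude exclusivity check: the tag must not occur in any off-target sequence."""
    return not any(tag in seq for seq in off_target_sequences)


# Toy alignment: invented flanking residues around a shared ten-residue stretch.
aligned = [
    "MKATNSYKRLVPGAEL",
    "MRATNSYKRLVPGSEL",
    "MK-TNSYKRLVPGADL",
]
# Invented sequence standing in for a member of an unrelated protein family.
off_targets = ["MSTNQYKRLVAGQW"]

candidates = conserved_windows(aligned)
exclusive = [(pos, tag) for pos, tag in candidates if is_exclusive(tag, off_targets)]
print(exclusive)
```

Candidates that pass both screens would still need the checks for position in the mature protein, post-translational modification sites, predicted antigenicity and synthesis properties, which this sketch does not attempt.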

The selected peptide sequence tags are synthesized. The peptide sequence tags are then coupled to appropriate immunogenic carrier molecules, typically Keyhole Limpet Hemocyanin, and the complexes are used to elicit production of IgY antibodies in hens. The IgY fraction is separated from the eggs of the immunized hens and the fraction is screened using Enzyme-Linked ImmunoSorbent Assays (ELISA) for binding to the original peptide target. Each IgY production run generates sufficient antibody for hundreds of thousands of immunodetections. Additional hens can be immunized to generate further comparable antibody preparations and for pooling of antibody preparations from multiple hens.

Members of the target protein family are extracted from a range of species, separated by SDS-polyacrylamide gel electrophoresis and electrophoretically transferred to membranes, and immunoblotting is used to characterize the binding of the antibodies to a range of members of the target protein family. Antibody preparations with good target affinity but which show non-specific binding to other proteins are subjected to affinity purification followed by re-characterization to attempt to improve specificity.

The novel antibodies can be applied to detect major proteins in a range of species, including uncharacterized species, with confidence that the detection affinity of the antibody is standard for all denatured members of the target protein family. Therefore, a quantity standard protein from one species or a synthetic quantity standard can be used for comparative quantitation of members of the protein family from other species.

DETAILED DESCRIPTION OF INVENTION

Peptide sequence tags designed for eliciting production of global antibodies binding all members of defined protein families or subfamilies.

In the following, all peptides are written according to convention from amino terminus to carboxy terminus using the standard single-letter amino acid code. All peptides are based on alignments of protein sequences accessed through the NCBI (ncbi.nlm.nih.gov) and SwissProt (expasy.ch) public databases. Where present, a lower-case “c” indicates a terminal cysteine not present in the original protein family but added for chemical coupling to the immunogenic carrier molecule, usually Keyhole Limpet Hemocyanin. An upper-case terminal “C” represents a cysteine present in the original protein, but also used for chemical coupling to the immunogenic carrier molecule.
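The notation can be read mechanically, as in the following minimal sketch. The function name `parse_tag` and the returned field names are hypothetical, and the sketch assumes that spaces in a written peptide only separate blocks of residues for readability.

```python
def parse_tag(written):
    """Split a written peptide tag into its native and synthesized sequences.

    A lower-case terminal "c" is a cysteine added for carrier coupling;
    an upper-case terminal "C" is a cysteine native to the protein family.
    """
    peptide = written.replace(" ", "")  # assumed: spaces are only visual grouping
    if not peptide:
        raise ValueError("empty peptide")

    if peptide[0] == "c":
        added = "amino terminus"
    elif peptide[-1] == "c":
        added = "carboxy terminus"
    else:
        added = None

    native = peptide.strip("c")  # residues present in the original protein
    return {
        "native": native,
        "synthesized": peptide.upper(),
        "added_cysteine": added,
        "native_terminal_cysteine": native[:1] == "C" or native[-1:] == "C",
    }


print(parse_tag("EVMHERNAHN FPLDc"))    # PsbA tag, cysteine added at the carboxy terminus
print(parse_tag("cTNSYKRLVP G"))        # GlnA tag, cysteine added at the amino terminus
print(parse_tag("CLRGGLDFTK DDENINS"))  # RbcL tag, native amino-terminal cysteine
```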

1. PsbA: EVMHERNAHN FPLDc (SEQ ID NO: 1) Photosystem II is the ultimate source of almost all biosynthetic reductant in the biosphere. The PsbA (D1) protein of Photosystem II is rapidly cycled under illumination in all oxygenic photobionts (Aro et al., 1993). Disruptions of PsbA cycling or losses of PsbA pools are central to loss of Photosystem II function and consequent photoinhibition of photosynthesis in cyanobacteria, algae and plants under a wide range of conditions including excess light, low temperature and UV exposure (e.g. Bouchard et al., 2002; Campbell et al., 1998). Tracking PsbA pools using the global PsbA antibody elicited by the PsbA peptide sequence tag can show the functional content of Photosystem II in a wide range of samples.

This PsbA peptide sequence tag is absolutely conserved in the PsbA proteins from almost all known oxygenic photoautotrophs, with only minor variants found in some liverworts. The global antibody raised against this PsbA peptide sequence tag has to date been demonstrated to specifically recognize the PsbA protein from a wide range of species including plants, red algae, cyanobacteria, green algal lichens and a mixed natural phytoplankton community. For example, the antibody is being applied to a biological oceanography project to study UV acclimation in natural phytoplankton at sites from the Arctic to the Antarctic (Bouchard et al., 2002), and also to a study of seasonal acclimation in lichens (Schofield et al., 2002).

2. RbcL: CLRGGLDFTK DDENINS (SEQ ID NO: 2) RbcL (RUBISCO) is the catalytic subunit of the primary carbon dioxide fixation enzyme in the biosphere and is present in all photobionts, along with many other prokaryotic organisms that fix carbon through chemoautotrophic mechanisms. The kinetic properties of RbcL are well characterized and the activity of RbcL limits total carbon dioxide uptake by many communities (e.g. Badger & Andrews, 1987; von Caemmerer & Quick, 2000). The enzyme has a low turnover rate (low kcat) but because the total flux of carbon fixation through the enzyme is large in photosynthetic organisms, the enzyme accumulates to high concentrations (e.g. 5-10% of extractable protein in cyanobacteria). It is thus a major sink for nitrogen and protein resources in photosynthetic organisms, and is indeed the most abundant protein on earth and a major protein source in the human diet, either directly through consumption of green plants or through contributions to forage feed for animals. Quantitating RbcL thus shows the total capacity for carbon uptake in a sample or community. This RbcL peptide sequence tag is diagnostic of the Type I sub-class of RUBISCO found in almost all oxygenic photoautotrophic organisms with the exception of dinoflagellates and the marine prochlorophyte Prochlorococcus. This RbcL peptide sequence tag is absolutely conserved in all known sequences from cyanobacteria, green algae, liverworts, mosses, conifers, eudicots, and monocots. The RbcL peptide sequence tag is conserved perfectly in some species, but shows minor variants in other species, of ferns, euglenoids, gamma-proteobacteria, beta-proteobacteria and alpha-proteobacteria. It is present but imperfectly conserved in red algae, diatoms, cryptomonads, haptophytes and brown algae.

The global antibody raised against this RbcL peptide sequence tag has to date been demonstrated to specifically recognize the RbcL protein from a wide range of species including cyanobacteria, green algal lichens, various plants and a mixed phytoplankton community dominated by diatoms.

IN THE DRAWING

The accompanying drawing shows the results of an immunoblot chemiluminescent detection of RbcL protein in total protein extracts from (a) an elm tree, (b) a cyanobacterium (Synechococcus sp. PCC 7942), (c) marsh grass (Spartina) and (d) a mixed population of marine phytoplankton from the Gulf of St. Lawrence, dominated by diatoms.

Total denatured protein extracts from the four samples were separated by SDS-PAGE and electrophoretically transferred to a hydrophobic membrane. The membrane was washed with a 1:4000 dilution of the global RbcL IgY antibody fraction (non-affinity purified) using standard immunoblotting procedures and solutions (Ausubel et al., 1997). The global RbcL antibody was then detected using a commercial secondary goat anti-chicken IgY antibody conjugated to a horseradish peroxidase enzyme label. Finally, the areas with bound horseradish peroxidase were detected using ECL+ (Amersham Pharmacia) chemiluminescent detection.

The drawing illustrates the broad detection range and examples of the three main utilities of the new global antibodies raised against peptide sequence tags, namely (a) detection of a major protein from organisms (elm and Spartina) which are relatively uncharacterized at the molecular level but which are of ecological interest, (b) detection of the same protein from a widely studied model species, the cyanobacterium Synechococcus, and (c) detection of the same protein family from a mixed phytoplankton community.

Application (c) is part of a study of natural phytoplankton responses to changing UVB (Bouchard et al., 2002), where both the absolute level of the target protein and the community structure change under UVB exposure, necessitating an antibody with even detection efficiencies for all members of the target protein family.

3. GlnA: cTNSYKRLVP G (SEQ ID NO: 3) GlnA or glutamine synthetase is the primary point for assimilation of inorganic ammonia nitrogen into the biosphere. During nitrogen assimilation all nitrogen sources are converted to ammonia, no matter what the original source, and then assimilated predominantly via the activity of glutamine synthetase. Thus, tracking levels of glutamine synthetase shows the metabolic capacity of a sample or community for total nitrogen assimilation.

This GlnA peptide sequence tag shows perfect to high conservation in alpha, beta and gamma proteobacteria, enterobacteria, most cyanobacteria, thermotogales, low GC gram+, euryarchaeotes and crenarchaeotes. It shows moderate conservation with aquificales, high GC gram+ (Streptomyces) and Trichodesmium thiebautii (a marine cyanobacterium).

The GlnA peptide sequence tag shows weak and sporadic conservation with glutamine synthetase Type III (GlnN) and with some glutaminyl-tRNA synthetases (glutamine-tRNA ligase) (GLNRS), but antibodies raised against this peptide sequence tag are not expected to detect these enzymes. This peptide sequence tag shows no conservation with any eukaryotic GlnA, and therefore does not react with glutamine synthetases from eukaryotic sources. The global antibody raised against this GlnA peptide sequence tag has to date been demonstrated to specifically recognize the GlnA protein from several species of cyanobacteria.

4. NifH: VESGGPEPGV GC (SEQ ID NO: 4) The NifH subunit is a component of the unstable nitrogenase enzyme system responsible for biological fixation of N₂ to assimilable ammonia. Levels of the NifH protein can be used to track the total potential metabolic capacity for nitrogen fixation in any sample or community. This NifH peptide sequence tag is perfectly or near-perfectly conserved in NifH proteins from all known organisms including: alpha, gamma, beta proteobacteria, enterobacteria, cyanobacteria, low GC gram+ bacteria, high GC gram+ bacteria and euryarchaeotes.

5. PsaA: CHFSWKMQSD VW (SEQ ID NO: 5) PsaA is a core subunit of Photosystem I, a key complex involved in transduction of light to chemical energy in all oxygenic photobionts. Photosystem I participates in both linear and cyclic electron transport in photoautotrophic organisms. The molar ratio between Photosystem II and Photosystem I varies widely between taxa and under different environmental conditions (Falkowski & Raven, 1997), and is an important factor for inferring the acclimation state and photosynthetic performance of an organism or a community. This PsaA peptide sequence tag is specific to the sequence of the PsaA core protein of Photosystem I from all known photoautotrophic organisms, with the exception of a single amino acid mismatch at the third position in the dinoflagellate Heterocapsa triquetra.

6. NirB: HWTGCPNSC (SEQ ID NO: 6) NirB or nitrite reductase catalyzes the reduction of nitrite to ammonia, which is an obligatory intermediary step in assimilation of inorganic nitrate into the biosphere. Nitrate is the dominant source of inorganic nitrogen supporting primary productivity in most ecosystems, and hence tracking NirB levels shows the metabolic capacity for assimilation of this key nitrogen source involved in eutrophication, agricultural run-off and stimulation of algal blooms including harmful (toxic) algal blooms.

7. RbcL185: KPKLGLSc (SEQ ID NO: 7) This peptide sequence tag is conserved in both Type I and Type II RbcL and hence can be applied to raise antibodies that will recognize both classes of RUBISCO enzyme, including the RUBISCO found in dinoflagellates and the zooxanthellae symbionts of coral.

8. RbcL185a: KPKLGLSGKN YGRc (SEQ ID NO: 8) This peptide sequence tag is conserved in Type I RUBISCO and could be applied to generate a second anti-RUBISCO antibody for use in ELISA sandwich assays.

9. RbcL115: DLFEEGSc (SEQ ID NO: 9) This peptide sequence tag is conserved in Type I RUBISCO and could be applied to generate a second anti-RUBISCO antibody for use in ELISA sandwich assays.

10. NarB: IFAEVGRRLG F (SEQ ID NO: 10) This peptide sequence tag is specific to the nitrate reductase (NarB) enzyme from cyanobacteria, a key enzyme in nitrate assimilation.

11. NifDMo: VSQSLGHHIA ND (SEQ ID NO: 11) This peptide sequence tag is specific to the NifD subunit of the sub-set of nitrogenases with an iron/molybdenum-based co-factor (as opposed to iron/vanadium or pure iron cofactors).

12. NifKMo: CTTCMAEVIG DDL (SEQ ID NO: 12) This peptide sequence tag is specific to the NifK subunit of the sub-set of nitrogenases with an iron/molybdenum-based co-factor (as opposed to iron/vanadium or pure iron cofactors).

13. NifKMo: CMAEVIGDDL (SEQ ID NO: 13) This peptide sequence tag is an alternate target specific to the NifK subunit of the sub-set of nitrogenases with an iron/molybdenum-based co-factor (as opposed to iron/vanadium or pure iron cofactors).

14. PsbA1: GRQWELc (SEQ ID NO: 14) This peptide sequence tag is specific to cyanobacterial PsbA1, a form of PsbA expressed in acclimated cyanobacteria, but not in eukaryotic photobionts (plants and algae). Monitoring this protein can thus track the contribution of acclimated cyanobacteria to Photosystem II light energy conversion in a mixed community.

15. PsbA2: GREWELc (SEQ ID NO: 15) This peptide sequence tag is specific to cyanobacterial PsbA2, a form of PsbA expressed only in cyanobacteria experiencing excitation stress or UVB stress (e.g. Campbell et al., 1998). Monitoring this protein can thus track when a cyanobacterial population is under excitation or UVB stress. It is also specific to the sole constitutive form of PsbA in eukaryotic photobionts (plants and algae).

16. PsaB: FPCDGPGRGG TC (SEQ ID NO: 16) This peptide sequence tag is specific to the PsaB core protein of Photosystem I, a key complex involved in transduction of light to chemical energy in all oxygenic photobionts.

REFERENCES

-   Altschul S et al. (2001) http://www.ncbi.nlm.nih.gov/BLAST
-   Aro E M et al. (1993) Biochim. Biophys. Acta 1143:113-134
-   Ausubel F et al. (1997) Short Protocols in Molecular Biology, Wiley, New York.
-   Badger M R, Andrews T J (1987) Progress in Photosynthesis Research Vol. III. Martinus Nijhoff Publishers, pp 601-609.
-   Bouchard J N et al. (2002) American Society of Photobiology, Quebec, Canada
-   Bryant D (ed.) (1994) The Molecular Biology of Cyanobacteria. Kluwer Academic.
-   Campbell D et al. (1998) Proceedings of the National Academy of Sciences of the USA 95: 364-369.
-   Corpet F (1988) Nucleic Acids Research 16 (22): 10881-10890.
-   http://www.expasy.ch SwissProt public database of annotated protein sequences and accompanying proteomic analysis tools.
-   Falkowski P G & Raven J A (1997) Aquatic Photosynthesis. Blackwell Science.
-   http://www.ncbi.nlm.nih.gov Searches for the target protein families show a range of representatives from different taxonomic groups, nonetheless sharing key conserved regions and core biochemical functions.
-   Orellana M V & Perry M J (1992) Limnology & Oceanography 478-490
-   Schofield S C et al. (2002) in revision.
-   Staley J T & Reysenbach A-L (eds.) (2002) Biodiversity of Microbial Life. Wiley-Liss.
-   von Caemmerer, S. & Quick, W. P. (2000) In Photosynthesis: Physiology and Metabolism, (ed. R. C. Leegood, T. D. Sharkey, and S. von Caemmerer), Kluwer.

1. A method for detecting the presence of a member of a GlnA protein (glutamine synthetase) in a sample comprising: (a) identifying and obtaining a peptide sequence tag conserved for GlnA protein, and exclusive to the GlnA protein, wherein said peptide sequence tag is SEQ ID NO: 3; (b) utilizing the tag to elicit the production of antibodies; and (c) using the antibodies to measure the concentration of a member of the GlnA protein in a sample.
2. The peptide sequence tag SEQ ID NO: 3.